Tight on Budget? Tight Bounds for r-Fold Approximate Differential Privacy
Many applications, such as anonymous communication systems, privacy-enhancing database queries, or privacy-enhancing machine-learning methods, require robust guarantees under thousands and sometimes millions of observations. The notion of r-fold approximate differential privacy (ADP) offers a well-established framework with a precise characterization of the degree of privacy after r observations by an attacker. However, existing bounds for r-fold ADP are loose and, if used for estimating the required degree of noise for an application, can lead to over-cautious choices of perturbation randomness and thus to suboptimal utility or overly high costs.
We present a numerical and widely applicable method for capturing the privacy loss of differentially private mechanisms under composition, which we call privacy buckets. With privacy buckets we compute provable upper and lower bounds on ADP for a given number of observations. We compare our bounds with state-of-the-art bounds for r-fold ADP, including Kairouz, Oh, and Viswanath's composition theorem (KOV), concentrated differential privacy, and the moments accountant. While KOV proved optimal bounds for heterogeneous adaptive k-fold composition, we show that for concrete sequences of mechanisms tighter bounds can be derived by taking the mechanisms' structure into account. We compare against previous bounds for the Laplace mechanism, the Gauss mechanism, a timing leakage reduction mechanism, and stochastic gradient descent, and we significantly improve on their results (except that we match the KOV bound for the Laplace mechanism, for which it seems tight). Our lower bounds almost meet our upper bounds, showing that no significantly tighter bounds are possible.
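The bucketing idea can be sketched as follows. This is a simplified illustration, not the authors' implementation: function names, the grid parameters `f` and `n`, and the handling of the "infinity" bucket are our own choices.

```python
import math
from collections import defaultdict

def bucketize(p, q, f=1.001, n=5000):
    """Discretize the privacy loss ln(p_i/q_i) of each outcome of two
    discrete distributions p, q into buckets of width ln(f); mass with
    loss beyond bucket n goes to an 'infinity' bucket that always
    counts fully towards delta."""
    buckets, inf_mass = defaultdict(float), 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue
        if qi == 0.0:
            inf_mass += pi          # loss is +infinity
            continue
        k = math.ceil(math.log(pi / qi) / math.log(f))
        if k > n:
            inf_mass += pi
        else:
            buckets[max(k, -n)] += pi
    return buckets, inf_mass

def compose(b1, i1, b2, i2, n=5000):
    """Under independent composition, privacy losses add, so bucket
    indices add and masses multiply."""
    out, inf_mass = defaultdict(float), i1 + i2 - i1 * i2
    for k1, m1 in b1.items():
        for k2, m2 in b2.items():
            if k1 + k2 > n:
                inf_mass += m1 * m2
            else:
                out[max(k1 + k2, -n)] += m1 * m2
    return out, inf_mass

def delta_upper(buckets, inf_mass, eps, f=1.001):
    """Upper bound on delta(eps): bucket k has loss at most k*ln(f)
    and contributes mass * (1 - e^(eps - k*ln(f))) when k*ln(f) > eps."""
    d = inf_mass
    for k, m in buckets.items():
        if k * math.log(f) > eps:
            d += m * (1.0 - math.exp(eps - k * math.log(f)))
    return d
```

Calling `compose` repeatedly yields r-fold upper bounds; a matching lower bound follows by evaluating each bucket at its lower loss edge (k-1)*ln(f) instead of the upper one.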
S-GBDT: Frugal Differentially Private Gradient Boosting Decision Trees
Privacy-preserving learning of gradient boosting decision trees (GBDT) has
the potential for strong utility-privacy tradeoffs for tabular data, such as
census data or medical metadata: classical GBDT learners can extract
non-linear patterns from small datasets. The state-of-the-art notion for
provable privacy properties is differential privacy, which requires that the
impact of single data points is limited and deniable. We introduce a novel
differentially private GBDT learner and utilize four main techniques to improve
the utility-privacy tradeoff. (1) We use an improved noise scaling approach
with tighter accounting of privacy leakage of a decision tree leaf compared to
prior work, resulting in noise that in expectation scales with , for
data points. (2) We integrate individual Rényi filters into our method to
learn from data points that have been underutilized during an iterative
training process, which -- potentially of independent interest -- yields a
natural yet effective approach to learning from streams of non-i.i.d. data. (3) We
incorporate the concept of random decision tree splits to concentrate privacy
budget on learning leaves. (4) We deploy subsampling for privacy amplification.
Our evaluation shows for the Abalone dataset ( training data points) a
-score of for , which the closest prior work only
achieved for . On the Adult dataset ( training data
points) we achieve test error of for which the
closest prior work only achieved for . For the Abalone dataset
for we achieve -score of which is very close to
the -score of for the nonprivate version of GBDT. For the Adult
dataset for we achieve test error which is very
close to the test error of the nonprivate version of GBDT.

Comment: The first two authors contributed equally to this work.
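The core step of privatizing a single leaf can be sketched as follows. This is a deliberately simplified illustration: it uses plain Laplace noise on a clipped residual sum under bounded (replace-one) differential privacy, not the paper's tighter accounting, Rényi filters, or subsampling, and all names are ours.

```python
import numpy as np

def dp_leaf_value(residuals, eps_leaf, clip=1.0, lam=1.0, rng=None):
    """Release a GBDT leaf value with eps_leaf-differential privacy.

    Each residual (gradient) is clipped to [-clip, clip], so replacing
    one record changes the sum by at most 2*clip; Laplace noise with
    scale 2*clip/eps_leaf then masks any single record's influence.
    The count len(residuals) is treated as public here, which fits a
    setting where the tree structure comes from random,
    data-independent splits.
    """
    rng = np.random.default_rng() if rng is None else rng
    r = np.clip(np.asarray(residuals, dtype=float), -clip, clip)
    noisy_sum = r.sum() + rng.laplace(scale=2.0 * clip / eps_leaf)
    return noisy_sum / (len(r) + lam)   # lam: regularization constant
```

The clipping bound directly controls the noise scale, which is why tighter per-leaf accounting of the privacy leakage translates into a better utility-privacy tradeoff.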
Divide and Funnel: a Scaling Technique for Mix-Networks
While many anonymous communication (AC) protocols have been proposed to provide anonymity over the internet, scaling to a large number of users while remaining provably secure is challenging. We tackle this challenge by proposing a new scaling technique to improve the scalability/anonymity of AC protocols that distributes the computational load over many nodes without completely disconnecting the paths different messages take through the network.
We demonstrate that our scaling technique is useful and practical through a sample core AC protocol, Streams, that offers provable security guarantees and scales to a million messages. The scaling technique ensures that each node in the system performs the computation-heavy public-key operation for only a tiny fraction of the total messages routed through the Streams network while maximizing the mixing/shuffling in every round.
We demonstrate Streams' performance through a prototype implementation. Our results show that Streams can scale well even if the system has a load of one million messages at any point in time. Streams maintains a latency of seconds while offering provable "one-in-a-billion" unlinkability, and can be leveraged for applications such as anonymous microblogging and network-level anonymity for blockchains. We also illustrate by examples that our scaling technique can be useful to many other AC protocols to improve their scalability and privacy, and may be of interest to protocol developers.
(Nothing else) MATor(s): Monitoring the Anonymity of Tor's Path Selection
In this paper we present MATor: a framework for rigorously assessing the degree of anonymity in the Tor network. The framework explicitly addresses how user anonymity is impacted by real-life characteristics of the actually deployed Tor network, such as its path selection algorithm, Tor consensus data, and the preferences and connections of the user. The anonymity assessment is based on rigorous anonymity bounds that are derived in an extension of the AnoA framework (IEEE CSF 2013). We show how to apply MATor to Tor's publicly available consensus and server descriptor data, thereby realizing the first real-time anonymity monitor. Based on experimental evaluations of this anonymity monitor on Tor Metrics data, we propose an alternative path selection algorithm that provides stronger anonymity guarantees without decreasing the overall performance of the Tor network.
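The path-selection characteristic at the heart of such an analysis is, to a first approximation, bandwidth weighting: deployed Tor picks relays with probability roughly proportional to their consensus bandwidth weight. A minimal sketch of that selection rule, ignoring guard/exit flags, position-dependent weight fractions, and family/subnet constraints:

```python
import random

def choose_relay(relays, rng=None):
    """Pick a relay with probability proportional to its bandwidth
    weight, as in Tor-style path selection.

    relays: list of (name, bandwidth) pairs with positive bandwidths.
    """
    rng = rng or random.Random()
    total = sum(bw for _, bw in relays)
    x = rng.uniform(0.0, total)
    acc = 0.0
    for name, bw in relays:
        acc += bw
        if x <= acc:
            return name
    return relays[-1][0]   # guard against floating-point rounding
```

This weighting is why anonymity bounds must be computed from live consensus data: an adversary who compromises the few highest-bandwidth relays observes a disproportionate share of circuits.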
Anonymity Trilemma: Strong Anonymity, Low Bandwidth Overhead, Low Latency---Choose Two
This work investigates the fundamental constraints of anonymous communication (AC) protocols.
We analyze the relationship between bandwidth overhead, latency overhead, and sender anonymity or recipient anonymity against a global passive (network-level) adversary.
We confirm the trilemma
that an AC protocol can only achieve two out of the following three properties:
strong anonymity (i.e., anonymity up to a negligible chance),
low bandwidth overhead, and low latency overhead.
We further study anonymity against a stronger global passive adversary that can additionally passively compromise some of the AC protocol nodes.
For a given number of compromised nodes,
we derive as a necessary constraint a relationship between bandwidth and latency overhead whose violation makes it impossible for an AC protocol to achieve strong anonymity.
We analyze prominent AC protocols from the literature and show to what extent they satisfy our necessary constraints.
Our fundamental necessary constraints offer a guideline not only for improving existing AC systems but also for designing novel AC protocols with non-traditional bandwidth and latency overhead choices.
A Toolchain for Privacy-Preserving Distributed Aggregation on Edge-Devices
Valuable insights, such as frequently visited environments in the wake of the
COVID-19 pandemic, can oftentimes only be gained by analyzing sensitive data
spread across edge-devices like smartphones. To facilitate such an analysis, we
present a toolchain for a distributed, privacy-preserving aggregation of local
data by taking the limited resources of edge-devices into account. The
distributed aggregation is based on secure summation and simultaneously
satisfies the notion of differential privacy. In this way, other parties can
neither learn the sensitive data of single clients nor a single client's
influence on the final result. We perform an evaluation of the power
consumption, the running time and the bandwidth overhead on real as well as
simulated devices and demonstrate the flexibility of our toolchain by
presenting an extension of the summation of histograms to distributed
clustering.
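The aggregation step can be sketched as follows. This is a toy illustration of pairwise additive masking in fixed-point arithmetic; a deployable toolchain additionally needs dropout handling, authenticated mask agreement (e.g. via key exchange), and calibrated distributed DP noise, and all names here are ours.

```python
import random

MOD = 2 ** 32      # ring in which masked values live
SCALE = 1000       # fixed-point scaling factor

def pairwise_masks(n_clients, seed):
    """One shared mask per client pair (i, j) with i < j: client i adds
    it and client j subtracts it, so all masks cancel in the sum."""
    rnd = random.Random(seed)
    return {(i, j): rnd.randrange(MOD)
            for i in range(n_clients) for j in range(i + 1, n_clients)}

def client_message(i, value, noise, masks, n_clients):
    """Encode value + this client's local DP noise share in fixed
    point, then blind it with the pairwise masks."""
    x = round((value + noise) * SCALE)
    for j in range(n_clients):
        if i < j:
            x += masks[(i, j)]
        elif j < i:
            x -= masks[(j, i)]
    return x % MOD

def aggregate(messages):
    """Server sums the blinded messages; the masks cancel, leaving
    only the (noisy) sum of client values -- never any single input."""
    total = sum(messages) % MOD
    if total >= MOD // 2:          # reinterpret as a signed value
        total -= MOD
    return total / SCALE
```

Because the server only ever sees uniformly masked ring elements, it learns nothing about individual inputs, while the per-client noise shares make the released sum differentially private.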
MixFlow: Assessing Mixnets Anonymity with Contrastive Architectures and Semantic Network Information
Traffic correlation attacks have illustrated the challenges of protecting communication metadata; yet short flows, as in messaging applications like Signal, have so far been protected from such attacks by practical Mixnets such as Loopix. This paper introduces a novel traffic correlation attack against short-flow applications like Signal that are tunneled through practical Mixnets like Loopix. We propose the MixFlow model, an approach for analyzing the unlinkability of communications through Mix networks. As a prominent example, we conduct our analysis on Loopix.
MixFlow is a contrastive model that looks for semantic relationships between entry and exit flows, even when the traffic is tunneled through metadata-protecting Mixnets like Loopix that employ Poisson mixing delay and cover traffic.
We use the MixFlow model to evaluate the resistance of Loopix Mix networks against an adversary that observes only the inflow and outflow of the Mixnet and tries to correlate communication flows. Our experiments indicate that the MixFlow model is exceptionally proficient at connecting end-to-end flows, even when the Poisson delay and cover traffic are increased. These findings challenge the conventional notion that adding Poisson mixing delay and cover traffic can obscure the metadata patterns and relationships between communicating parties. Despite the implementation of Poisson mixing countermeasures in Mixnets, MixFlow is still capable of effectively linking end-to-end flows, enabling the extraction of meta-information and correlation between inflows and outflows. Our findings have important implications for existing Poisson-mixing techniques and open up new opportunities for analyzing the anonymity and unlinkability of communication protocols.
AnoA: A Framework For Analyzing Anonymous Communication Protocols
Anonymous communication (AC) protocols such as the widely used Tor network have been designed to provide anonymity over the Internet to their participating users. While AC protocols have been the subject of several security and anonymity analyses in recent years, there still does not exist a framework for analyzing complex systems, such as Tor, and their different anonymity properties in a unified manner.
In this work we present AnoA: a generic framework for defining, analyzing, and quantifying anonymity properties for AC protocols. In addition to quantifying the (additive) advantage of an adversary in an indistinguishability-based definition, AnoA uses a multiplicative factor, inspired by differential privacy. AnoA enables a unified quantitative analysis of well-established anonymity properties, such as sender anonymity, sender unlinkability, and relationship anonymity. AnoA modularly specifies adversarial capabilities by a simple wrapper construction, called adversary classes. We examine the structure of these adversary classes and identify conditions under which it suffices to establish anonymity guarantees for single messages in order to derive guarantees for arbitrarily many messages. We coin this condition single-challenge reducibility. This then leads us to the definition of Plug'n'Play adversary classes (PAC), which are easy to use, expressive, and single-challenge reducible. Additionally, we show that our framework is compatible with the universal composability (UC) framework. Leveraging a recent security proof about Tor, we illustrate how to apply AnoA to a simplified version of Tor against passive adversaries.
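In such a definition, the adversary's success probabilities p0 and p1 in the two challenge worlds must satisfy p0 <= e^eps * p1 + delta (and symmetrically). The smallest additive slack for a given multiplicative factor is a one-liner; this only illustrates the shape of the definition and is not part of the framework itself:

```python
import math

def anoa_delta(p0, p1, eps):
    """Smallest delta >= 0 such that
        p0 <= exp(eps) * p1 + delta   and   p1 <= exp(eps) * p0 + delta,
    i.e. the (eps, delta)-indistinguishability gap between the two
    challenge worlds of an AnoA-style anonymity game."""
    return max(0.0, p0 - math.exp(eps) * p1, p1 - math.exp(eps) * p0)
```

With eps = 0 this collapses to the plain additive advantage |p0 - p1|; the multiplicative factor e^eps is what lets the guarantees compose in a differential-privacy style.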